Learning Taxonomies by Dependence Maximization

Authors

  • Matthew B. Blaschko
  • Arthur Gretton
Abstract

We introduce a family of unsupervised algorithms, numerical taxonomy clustering, to simultaneously cluster data and learn a taxonomy that encodes the relationships between the clusters. The algorithms work by maximizing the dependence between the taxonomy and the original data. The resulting taxonomy is a more informative visualization of complex data than simple clustering; in addition, taking into account the relations between different clusters is shown to substantially improve the quality of the clustering when compared with state-of-the-art algorithms in the literature (both spectral clustering and a previous dependence maximization approach). We demonstrate our algorithm on image and text data.
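
As a rough illustration of "maximizing the dependence" between a clustering and the data, the sketch below scores candidate partitions with the biased empirical Hilbert-Schmidt Independence Criterion, tr(KHLH)/(n-1)^2. The abstract does not name the dependence measure, so the use of HSIC here is an assumption, as are the Gaussian kernel, the bandwidth `sigma`, the toy data, and the helper names `gaussian_kernel`, `partition_kernel`, and `hsic`. The optional matrix `Y` is a hypothetical stand-in for relations between clusters; it is not taken from the abstract.

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix on the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def partition_kernel(labels, n_clusters, Y=None):
    """Kernel on cluster assignments: Pi @ Y @ Pi.T.

    Pi is the one-hot assignment matrix. Y (hypothetical) encodes how
    related the clusters are; the identity treats clusters as unrelated.
    """
    Pi = np.eye(n_clusters)[np.asarray(labels)]
    if Y is None:
        Y = np.eye(n_clusters)
    return Pi @ Y @ Pi.T


def hsic(K, L):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


# Toy data: two tight, well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
K = gaussian_kernel(X, sigma=1.0)

# A partition that follows the groups versus one that cuts across them.
good = hsic(K, partition_kernel([0, 0, 0, 1, 1, 1], 2))
bad = hsic(K, partition_kernel([0, 1, 0, 1, 0, 1], 2))
print(good > bad)
```

A dependence-maximizing clustering method would search over assignments (and, in a taxonomy setting, over the cluster-relation matrix) for the highest score; this sketch only evaluates two fixed candidates and does not implement that search.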

Related resources

Feature Selection via Dependence Maximization

We introduce a framework of feature selection based on dependence maximization between the selected features and the labels of an estimation problem, using the Hilbert-Schmidt Independence Criterion. The key idea is that good features should be highly dependent on the labels. Our approach leads to a greedy procedure for feature selection. We show that a number of existing feature selectors are ...

Learning to integrate web taxonomies

We investigate machine learning methods for automatically integrating objects from different taxonomies into a master taxonomy. This problem is not only currently pervasive on the Web, but is also important to the emerging Semantic Web. A straightforward approach to automating this process would be to build classifiers through machine learning and then use these classifiers to classify objects ...

Learning Co-Substructures by Kernel Dependence Maximization

Modeling associations between items in a dataset is a problem that is frequently encountered in data and knowledge mining research. Most previous studies have simply applied a predefined fixed pattern for extracting the substructure of each item pair and then analyzed the associations between these substructures. Using such fixed patterns may not, however, capture the significant association. W...

Influence Maximization in Social Networks using Learning Automata

The influence maximization problem is one of the challenges in online social networks. This problem refers to finding a small set of members of a social network whose activation maximizes information propagation under a propagation model such as the independent cascade model. For the maximization problem, the greedy algorithm has been presented which is close to optimal response by...

Energy Scheduling in Power Market under Stochastic Dependence Structure

Since the emergence of the power market, the target of power generating utilities has mainly switched from cost minimization to revenue maximization. They dispatch their power generation units in the uncertain environment of the power market. As a result, multi-stage stochastic programming has been applied widely by many power generating agents as a suitable tool for dealing with self-scheduling...

Publication date: 2008